Search Result

Speech steganalysis method based on deep residual network

REN Yiming, WANG Rangding, YAN Diqun, LIN Yuzhen

Journal of Computer Applications 2021, 41 (3): 774-779. DOI: 10.11772/j.issn.1001-9081.2020060763

To address the low accuracy of existing methods in detecting Least Significant Bit (LSB) steganography in WAV-format speech, a speech steganalysis method based on a deep residual network was proposed. First, the residual signal of the input speech was computed by a fixed convolutional layer composed of multiple sets of high-pass filters, and a truncated linear unit was applied to truncate the residual signal. Then, a deep network was built by stacking convolutional layers and the designed residual blocks to extract deep steganographic features. Finally, the classification result was output by a classifier composed of a fully connected layer and a Softmax layer. Experimental results at different embedding rates for two steganography methods, Hide4PGP (Hide 4 Pretty Good Privacy) and LSB matching (Least Significant Bit matching), show that the proposed method outperforms existing Convolutional Neural Network (CNN)-based steganalysis methods. Compared with LinNet, its detection accuracy is 7 percentage points higher when detecting Hide4PGP at an embedding rate of 0.1 bps (bits per sample).
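The preprocessing step described in this abstract (a fixed high-pass filter followed by truncation) can be illustrated with a minimal sketch. The second-order kernel `(1, -2, 1)` and the threshold `3` are assumptions chosen for illustration; the abstract does not state the filters or the truncation threshold actually used.

```python
def high_pass_residual(samples, kernel=(1, -2, 1)):
    """Slide a fixed kernel over the signal (valid positions only).

    The kernel is a hypothetical second-order difference filter,
    not necessarily one of the filters used in the paper.
    """
    width = len(kernel)
    return [
        sum(kernel[j] * samples[i + j] for j in range(width))
        for i in range(len(samples) - width + 1)
    ]


def truncated_linear_unit(values, threshold=3):
    """Clip every value to the interval [-threshold, threshold]."""
    return [max(-threshold, min(threshold, v)) for v in values]


samples = [0, 1, 4, 9, 16, 5]
residual = high_pass_residual(samples)
truncated = truncated_linear_unit(residual)
print(residual)   # smooth regions give small residuals, the jump gives a large one
print(truncated)  # the large residual is clipped to the threshold
```

Truncation keeps the small-amplitude residuals, where embedding changes of plus or minus one are visible, and limits the influence of large residuals caused by the speech content itself.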

Audio steganography detection model combining residual network and extreme gradient boosting

CHEN Lang, WANG Rangding, YAN Diqun, LIN Yuzhen

Journal of Computer Applications 2021, 41 (2): 449-455. DOI: 10.11772/j.issn.1001-9081.2020060775

To address the low accuracy of current audio steganography detection methods on steganography based on Syndrome-Trellis Codes (STC), and considering the strength of Convolutional Neural Networks (CNN) in extracting abstract features, a model for audio steganography detection combining a Deep Residual Network (DRN) and eXtreme Gradient Boosting (XGBoost) was proposed. Firstly, a fixed-parameter High-Pass Filter (HPF) was used to preprocess the input audio, and features were extracted through three convolutional layers. The Truncated Linear Unit (TLU) activation function was applied in the first convolutional layer so that the model could adapt to the distribution of steganographic signals with a low Signal-to-Noise Ratio (SNR). Then, abstract features were further extracted by five stages of residual blocks and pooling operations. Finally, the extracted high-dimensional features were passed through fully connected layers and dropout layers and used as inputs to the XGBoost model for classification. STC steganography and Least Significant Bit Matching (LSBM) steganography were each detected. Experimental results showed that at embedding rates of 0.5 bps (bits per sample), 0.2 bps and 0.1 bps, that is, when the average number of bits modified per audio sample was 0.5, 0.2 and 0.1, the proposed model achieved average detection accuracies of 73.27%, 70.16% and 65.18% respectively for STC steganography with a sub-check matrix of height 7, and 86.58%, 76.08% and 72.82% respectively for LSBM steganography. Compared with traditional steganography detection methods based on handcrafted features and with deep learning steganography detection methods, the proposed model improves the average detection accuracy for both steganography algorithms by more than 10 percentage points.
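For readers unfamiliar with LSBM, the following is a minimal sketch of generic LSB matching embedding, one of the two steganography schemes the detector targets. It is not the paper's code: the function names, the 16-bit sample range, and the choice to embed in the first samples in order (rather than at key-selected positions) are assumptions made for illustration.

```python
import random


def lsb_matching_embed(samples, bits, rng, lo=-32768, hi=32767):
    """Embed one message bit per sample by randomly adding or subtracting 1.

    A sample is changed only when its least significant bit differs from
    the message bit. Requires len(bits) <= len(samples).
    """
    stego = list(samples)
    for i, bit in enumerate(bits):
        if (stego[i] & 1) != bit:
            step = rng.choice((-1, 1))
            if not lo <= stego[i] + step <= hi:
                step = -step  # stay inside the 16-bit range (assumed)
            stego[i] += step
    return stego


def extract_bits(samples, count):
    """Read the message back from the least significant bits."""
    return [s & 1 for s in samples[:count]]


cover = [10, -3, 32767, -32768, 7, 120, -45, 8]
message = [1, 1, 0, 1]  # 4 message bits carried by 8 samples
stego = lsb_matching_embed(cover, message, random.Random(0))
print(extract_bits(stego, len(message)))
```

Because each modified sample changes by at most one quantization step, the stego signal has a very low signal-to-noise ratio relative to the speech content, which is why the detectors rely on high-pass residuals.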

Forensics algorithm of various operations for digital speech

XIANG Li, YAN Diqun, WANG Rangding, LI Xiaowen

Journal of Computer Applications 2019, 39 (1): 126-130. DOI: 10.11772/j.issn.1001-9081.2018071596

Most existing forensic methods for digital speech aim at detecting one specific operation and therefore cannot identify several operations at once. To solve this problem, a universal forensic algorithm was proposed for simultaneously detecting various operations, such as pitch modification, low-pass filtering, high-pass filtering and noise addition. Firstly, the statistical moments of Mel-Frequency Cepstral Coefficients (MFCC) were calculated, and cepstral mean and variance normalization was applied to the moments. Then, a multi-class classifier was constructed from multiple two-class classifiers. Finally, the classifier was used to identify the various types of speech operations. The experimental results on the TIMIT and UME speech datasets show that the proposed universal features achieve a detection accuracy of over 97% for various speech operations, and the detection accuracy remains above 96% in the MP3 compression robustness test.
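The normalization step mentioned above can be sketched as follows. This is a generic mean and variance normalization over a set of feature vectors, assuming each row is one feature vector (for example, the MFCC moments of one utterance); the function name and the toy numbers are hypothetical, and the MFCC extraction itself is not shown.

```python
from statistics import fmean, pstdev


def mean_variance_normalize(vectors):
    """Shift each feature dimension to zero mean and scale it to unit variance.

    Dimensions with zero spread are mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*vectors))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in vectors
    ]


features = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = mean_variance_normalize(features)
for row in normalized:
    print(row)
```

After normalization, both dimensions have the same scale even though the raw second dimension was ten times larger, which keeps any single moment from dominating the classifier.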

Playback speech detection algorithm based on modified cepstrum feature

LIN Lang, WANG Rangding, YAN Diqun, LI Can

Journal of Computer Applications 2018, 38 (6): 1648-1652. DOI: 10.11772/j.issn.1001-9081.2017112822

With the development of speech technology, various kinds of phishing speech, represented by playback speech, have posed serious challenges to voiceprint authentication systems and audio forensics technology. To counter playback speech attacks on voiceprint authentication systems, a new detection algorithm based on a modified cepstrum feature was proposed. Firstly, the coefficient of variation was used to analyze the difference between original speech and playback speech in the frequency domain. Secondly, a new filter bank composed of inverse-Mel filters and linear filters was used to replace the Mel filter bank in the extraction of Mel Frequency Cepstral Coefficients (MFCC), yielding the modified cepstrum feature based on the new filter bank. Finally, a Gaussian Mixture Model (GMM) was used as the classifier to discriminate between the two kinds of speech. The experimental results show that the modified cepstrum feature can effectively detect playback speech, with an equal error rate of about 3.45%.
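Two of the quantities named in this abstract can be sketched briefly: the coefficient of variation, and the contrast between Mel and inverse-Mel filter placement. The Mel formula below is the commonly used 2595·log10(1 + f/700) form. Constructing the inverse-Mel centers by mirroring the Mel centers about the band is one common construction and is an assumption here; the abstract does not specify the paper's filter bank design, and all function names are hypothetical.

```python
import math
from statistics import fmean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (values assumed positive)."""
    return pstdev(values) / fmean(values)


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_centers(count, max_hz):
    """Center frequencies equally spaced on the Mel scale (dense at low Hz)."""
    top = hz_to_mel(max_hz)
    return [mel_to_hz(top * k / (count + 1)) for k in range(1, count + 1)]


def inverse_mel_centers(count, max_hz):
    """Mirror the Mel centers about the band so they are dense at high Hz."""
    return sorted(max_hz - c for c in mel_centers(count, max_hz))


print(coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9]))
print([round(c) for c in mel_centers(6, 8000.0)])
print([round(c) for c in inverse_mel_centers(6, 8000.0)])
```

The mirrored bank gives finer resolution at high frequencies, which is the region a standard Mel bank resolves most coarsely.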

Cell-phone source identification based on spectral fusion features of recorded speech

PEI Anshan, WANG Rangding, YAN Diqun

Journal of Computer Applications 2018, 38 (3): 884-890. DOI: 10.11772/j.issn.1001-9081.2017071864

With the popularity of cell-phone recording devices and the availability of powerful, easy-to-use digital media editing software, source cell-phone identification has become a hot topic in multimedia forensics. A cell-phone source recognition algorithm based on spectral fusion features was proposed for this task. Firstly, spectrograms of the same speech recorded by different cell-phones were analyzed, and the spectral characteristics were found to differ between cell-phones; the logarithmic spectrum, phase spectrum and information quantity of the speech were then studied. Secondly, the three features were concatenated to form the original fusion feature, and the sample feature space was constructed from the original fusion feature of each sample. Finally, the CfsSubsetEval evaluation function of the WEKA platform was used with best-first search to select features, and LibSVM was used for model training and sample recognition after feature selection. Twenty-three popular cell-phone models were evaluated in the experiment. The results showed that the proposed spectral fusion feature identifies cell-phone brands more accurately than any single spectral feature, with average identification accuracies of 99.96% and 99.91% on the TIMIT database and the CKC-SD database respectively. In addition, compared with Hanilci's source identification algorithm based on Mel-frequency cepstral coefficients, the average identification accuracy was improved by 6.58 and 5.14 percentage points respectively. Therefore, the proposed algorithm can improve the average identification accuracy and effectively reduce the false positive rate of cell-phone source identification.

Recaptured speech detection algorithm based on convolutional neural network

LI Can, WANG Rangding, YAN Diqun

Journal of Computer Applications 2018, 38 (1): 79-83. DOI: 10.11772/j.issn.1001-9081.2017071896

To address the problem that recaptured speech attacks on speaker recognition systems harm the rights and interests of legitimate users, a recaptured speech detection algorithm based on a Convolutional Neural Network (CNN) was proposed. Firstly, the spectrograms of the original speech and the recaptured speech were extracted and input into the CNN for feature extraction and classification. Secondly, a new network architecture was constructed for the detection task, and the effect of spectrograms with different window shifts was discussed. Finally, cross-over experiments on various recapture and replay devices were conducted. The experimental results demonstrate that the proposed method can accurately determine whether the detected speech is recaptured, with a recognition rate of 99.26%. Compared with the mute segment Mel-Frequency Cepstral Coefficient (MFCC) algorithm, the channel mode noise algorithm and the long window scale factor algorithm, the recognition rate is increased by about 26 percentage points, about 21 percentage points and about 0.35 percentage points respectively.
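The window shift mentioned above controls how many frames, and therefore how many spectrogram columns, a signal of fixed length produces. A minimal framing sketch, with a hypothetical function name and toy sizes (the paper's window length and shifts are not given in the abstract):

```python
def frame_signal(samples, window, shift):
    """Split a signal into overlapping frames of `window` samples,
    advancing by `shift` samples each time; an incomplete tail is dropped."""
    return [
        samples[start:start + window]
        for start in range(0, len(samples) - window + 1, shift)
    ]


signal = list(range(10))
print(len(frame_signal(signal, window=4, shift=2)))  # smaller shift, more frames
print(len(frame_signal(signal, window=4, shift=3)))  # larger shift, fewer frames
```

Each frame would then be windowed and transformed to one spectrogram column, so a smaller shift gives the network a finer time axis at the cost of a larger input.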

Tampering detection algorithm based on noise consistency for digital voice heterologous splicing

YANG Fan, YAN Diqun, XU Hongwei, WANG Rangding, JIN Chao, XIANG Li

Journal of Computer Applications 2017, 37 (12): 3452-3457. DOI: 10.11772/j.issn.1001-9081.2017.12.3452

Heterologous splicing is a typical tampering behavior for digital voice. It mainly uses the audio editing software to splice the voice clips recorded in different scenes, so as to achieve the purpose of changing the semantics of voice. Considering the difference of background noise in different scenes, a tampering detection algorithm based on noise consistency for digital voice heterologous splicing was proposed. Firstly, the Time-Recursive Averaging (TRA) algorithm was applied to extract the background noise contained in the voice to be detected. Then, the Change-Point Detection (CPD) algorithm was used to detect whether abrupt changes existed in the noise variance, which was used to determine whether the voice was tampered, and to locate the tampering position of the testing voice. The experimental results show that the proposed algorithm can achieve good performance in detecting the tampering position of heterologous splicing for digital voice.
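The idea of locating a splice from an abrupt change in noise variance can be illustrated with a minimal single change-point sketch. It assumes the input is already a zero-mean background noise estimate (the TRA extraction step is not reproduced) and uses a Gaussian likelihood cost, which is a stand-in and not necessarily the CPD algorithm used in the paper. The function name and parameters are hypothetical.

```python
import math


def variance_change_point(noise, min_segment=10, eps=1e-12):
    """Return the index where the second segment starts, chosen so that
    modelling the two segments with separate noise powers fits best.

    Cost for a split at k: k*log(power_left) + (n-k)*log(power_right).
    Returns None if the signal is too short for two segments.
    """
    n = len(noise)
    prefix = [0.0]  # prefix[i] = sum of squares of the first i samples
    for x in noise:
        prefix.append(prefix[-1] + x * x)
    best_index, best_cost = None, math.inf
    for k in range(min_segment, n - min_segment + 1):
        left_power = prefix[k] / k
        right_power = (prefix[n] - prefix[k]) / (n - k)
        cost = (k * math.log(left_power + eps)
                + (n - k) * math.log(right_power + eps))
        if cost < best_cost:
            best_index, best_cost = k, cost
    return best_index


# Synthetic noise: a quiet scene followed by a louder scene.
quiet = [(-1) ** i * 1.0 for i in range(100)]
loud = [(-1) ** i * 5.0 for i in range(100)]
print(variance_change_point(quiet + loud))
```

A complete detector would also need a decision rule for whether any change exists at all, for example by comparing the best two-segment cost against the single-segment cost.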

Smart wireless water meter reading system for multi-story residential buildings

FU Songyin, WANG Rangding, YAO Ling, ZHANG Chengyu, SHAN Guanmin, HU Guowei

Journal of Computer Applications 2017, 37 (1): 170-174. DOI: 10.11772/j.issn.1001-9081.2017.01.0170

A Smart Wireless Water Meter Reading System (SWWMRS) built on a conventional Wireless Sensor Network (WSN) platform cannot meet the practical requirements of low cost, low power consumption, high efficiency and high reliability. In this work, a novel SWWMRS for typical multi-story buildings was proposed. Based on the features of the SWWMRS, its deployment environment and its business logic, an improved all-neighbor discovery algorithm was proposed to achieve automatic networking and centralized routing management. At the meter reading stage, a minimum global forwarding strategy combined with a strategy that avoids the nodes with minimum residual energy was adopted to balance energy consumption between nodes. Additionally, the collision avoidance mechanism in the Media Access Control (MAC) layer and the low-power idle listening strategy were optimized. Test results for the proposed system in a 24-story residential building show that its communication distance, power consumption and reliability meet the needs of practical applications. Compared with the CC2530 scheme, the system also achieves better communication distance, meter reading success rate, efficiency and power consumption.

Near field communication-enabled water meter system with mobile payment

ZHANG Chengyu, WANG Rangding, YAO Ling, FU Songyin, ZUO Fuqiang, GAO Qifei, JIANG Ming

Journal of Computer Applications 2017, 37 (1): 166-169. DOI: 10.11772/j.issn.1001-9081.2017.01.0166

To address the inefficiency and inconvenience of traditional prepaid meters, a Near Field Communication (NFC)-enabled water meter system with mobile payment and data query functions was proposed. Firstly, according to the business requirements of the prepaid water meter, the overall architecture of the water meter system was developed based on NFC technology, and the software and hardware were designed. Secondly, a low-power mechanism that wakes up the water meter by detecting changes in the external magnetic field was proposed. Finally, the security of mobile payment in the water meter system was analyzed based on NFC security protocols. The experimental results show that users can wake up the water meter system on demand and use the mobile payment, data query and data upload functions with NFC mobile phones or other mobile terminals equipped with an NFC module.

Homomorphic compensation of recaptured image detection based on direction predict

XIE Zhe, WANG Rangding, YAN Diqun, LIU Huacheng

Journal of Computer Applications 2014, 34 (9): 2687-2690. DOI: 10.11772/j.issn.1001-9081.2014.09.2687

To resist recaptured-image attacks on face recognition systems, an algorithm based on predicting the gradient direction of face images was proposed. The contrast between real and recaptured images was enhanced by illumination compensation with adaptive Gaussian homomorphic filtering. A Support Vector Machine (SVM) classifier was trained and tested on the two kinds of images using features obtained by convolution with 8-direction Sobel operators. In an experiment on 522 live and recaptured faces drawn from domestic and foreign face databases, including the NUAA Imposter Database and the Yale Face Database, the detection rate reached 99.51%. On a library of 522 samples, built by taking 261 live face photos with a Samsung Galaxy Nexus phone and then recapturing them, the detection rate was 98.08% and the feature extraction time was 167.04 s. The results show that the proposed algorithm can classify live and recaptured faces with high extraction efficiency.
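An 8-direction Sobel operator can be built by rotating the outer ring of the standard 3×3 Sobel kernel in 45° steps. The sketch below shows that construction and applies it to a single 3×3 patch. It is a generic compass-Sobel illustration with hypothetical names; how the paper turns these responses into SVM features is not stated in the abstract.

```python
# Outer-ring positions of a 3x3 kernel, listed clockwise from the top-left.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
# Standard horizontal-gradient Sobel values at those positions.
BASE_VALUES = [-1, 0, 1, 2, 1, 0, -1, -2]


def sobel_kernel(step):
    """Sobel kernel rotated clockwise by `step` * 45 degrees (center stays 0)."""
    kernel = [[0] * 3 for _ in range(3)]
    for i, (row, col) in enumerate(RING):
        kernel[row][col] = BASE_VALUES[(i - step) % 8]
    return kernel


def directional_responses(patch):
    """Response of a 3x3 patch to each of the 8 rotated kernels."""
    return [
        sum(kernel[r][c] * patch[r][c] for r in range(3) for c in range(3))
        for kernel in (sobel_kernel(step) for step in range(8))
    ]


# A vertical edge: dark on the left, bright on the right.
patch = [[0, 0, 9], [0, 0, 9], [0, 0, 9]]
responses = directional_responses(patch)
print(responses)
print(responses.index(max(responses)))  # index of the dominant direction
```

The index of the strongest response gives a quantized gradient direction for the pixel at the patch center.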
